Search for: All records

Creators/Authors contains: "Pissis, Solon P."





Differentially Private String Sanitization for Frequency-Based Mining Tasks

https://doi.org/10.1109/ICDM51629.2021.00014

Chen, Huiping ; Dong, Changyu ; Fan, Liyue ; Loukides, Grigorios ; Pissis, Solon P. ; Stougie, Leen ( December 2021 , 2021 IEEE International Conference on Data Mining (ICDM))

Full Text Available
Efficient Computation of Sequence Mappability

https://doi.org/10.1007/s00453-022-00934-y

Charalampopoulos, Panagiotis ; Iliopoulos, Costas S. ; Kociumaka, Tomasz ; Pissis, Solon P. ; Radoszewski, Jakub ; Straszyński, Juliusz ( February 2022 , Algorithmica)

Abstract
Sequence mappability is an important task in genome resequencing. In the $(k, m)$-mappability problem, for a given sequence $T$ of length $n$, the goal is to compute a table whose $i$th entry is the number of indices $j \neq i$ such that the length-$m$ substrings of $T$ starting at positions $i$ and $j$ have at most $k$ mismatches. Previous works on this problem focused on heuristics computing a rough approximation of the result or on the case of $k = 1$. We present several efficient algorithms for the general case of the problem. Our main result is an algorithm that, for $k = O(1)$, works in $O(n)$ space and, with high probability, in $O(n \cdot \min\{m^k, \log^k n\})$ time. Our algorithm requires a careful adaptation of the $k$-errata trees of Cole et al. [STOC 2004] to avoid multiple counting of pairs of substrings. Our technique can also be applied to solve the all-pairs Hamming distance problem introduced by Crochemore et al. [WABI 2017]. We further develop $O(n^2)$-time algorithms to compute all $(k, m)$-mappability tables for a fixed $m$ and all $k \in \{0, \ldots, m\}$ or a fixed $k$ and all $m \in \{k, \ldots, n\}$. Finally, we show that, for $k, m = \Theta(\log n)$, the $(k, m)$-mappability problem cannot be solved in strongly subquadratic time unless the Strong Exponential Time Hypothesis fails. This is an improved and extended version of a paper presented at SPIRE 2018.
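
The problem definition in this abstract can be made concrete with a brute-force reference implementation. The sketch below is not the paper's algorithm (which relies on k-errata trees); it compares every pair of length-m substrings directly, so it takes roughly quadratic time in the sequence length. The function name `mappability` and the 0-based indexing are illustrative assumptions.

```python
def mappability(text: str, m: int, k: int) -> list[int]:
    """Brute-force (k, m)-mappability table (illustrative, 0-based).

    Entry i counts the start positions j != i whose length-m substring
    differs from the one starting at i in at most k positions.
    """
    starts = len(text) - m + 1  # number of length-m substrings
    if m <= 0 or starts <= 0:
        return []
    table = [0] * starts
    for i in range(starts):
        for j in range(i + 1, starts):
            mismatches = 0
            for a, b in zip(text[i:i + m], text[j:j + m]):
                if a != b:
                    mismatches += 1
                    if mismatches > k:
                        break  # this pair already exceeds the budget
            if mismatches <= k:
                # Hamming distance is symmetric, so credit both positions.
                table[i] += 1
                table[j] += 1
    return table


if __name__ == "__main__":
    print(mappability("abab", 2, 0))
```

Such a reference version is mainly useful for checking a faster implementation on small inputs.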

Efficient Data Structures for Range Shortest Unique Substring Queries

https://doi.org/10.3390/a13110276

Abedin, Paniz ; Ganguly, Arnab ; Pissis, Solon P. ; Thankachan, Sharma V. ( November 2020 , Algorithms)
Let T[1,n] be a string of length n and T[i,j] be the substring of T starting at position i and ending at position j. A substring T[i,j] of T is a repeat if it occurs more than once in T; otherwise, it is a unique substring of T. Repeats and unique substrings are of great interest in computational biology and information retrieval. Given string T as input, the Shortest Unique Substring problem is to find a shortest substring of T that does not occur elsewhere in T. In this paper, we introduce the range variant of this problem, which we call the Range Shortest Unique Substring problem. The task is to construct a data structure over T answering the following type of online queries efficiently. Given a range [α,β], return a shortest substring T[i,j] of T with exactly one occurrence in [α,β]. We present an O(n log n)-word data structure with O(log_w n) query time, where w = Ω(log n) is the word size. Our construction is based on a non-trivial reduction allowing us to apply a recently introduced optimal geometric data structure [Chan et al., ICALP 2018]. Additionally, we present an O(n)-word data structure with O(√n log^ε n) query time, where ε > 0 is an arbitrarily small constant. The latter data structure relies heavily on another geometric data structure [Nekrich and Navarro, SWAT 2012].
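
To illustrate the query described in this abstract, here is a naive per-query sketch, not the data structures from the paper. The abstract does not spell out how an occurrence is counted relative to the range; the sketch assumes an occurrence counts when its start position lies in [α,β], so the returned substring may end after β. The function name and the tie-breaking rule (leftmost start) are also assumptions.

```python
def range_shortest_unique_substring(text: str, alpha: int, beta: int) -> tuple[int, int]:
    """Naive range query (illustrative; positions are 1-based, inclusive).

    Assumption: an occurrence is "in [alpha, beta]" when it STARTS there.
    Returns (i, j) such that T[i, j] is a shortest substring with exactly
    one such occurrence; ties are broken by the leftmost start.
    """
    n = len(text)
    if not (1 <= alpha <= beta <= n):
        raise ValueError("expected 1 <= alpha <= beta <= len(text)")
    for length in range(1, n - alpha + 2):
        counts: dict[str, int] = {}
        first_start: dict[str, int] = {}
        # Only starts that leave room for a substring of this length.
        last = min(beta, n - length + 1)
        for start in range(alpha, last + 1):
            piece = text[start - 1:start - 1 + length]
            counts[piece] = counts.get(piece, 0) + 1
            first_start.setdefault(piece, start)
        unique_starts = [first_start[p] for p, c in counts.items() if c == 1]
        if unique_starts:
            i = min(unique_starts)
            return (i, i + length - 1)
    # Not reached: at the maximal length only the start alpha remains,
    # and a single candidate is trivially unique.
    raise AssertionError("unreachable")


if __name__ == "__main__":
    print(range_shortest_unique_substring("abab", 1, 4))
```

With α = 1 and β = n, this reduces to the plain Shortest Unique Substring problem on the whole string.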
Full Text Available
Range Shortest Unique Substring Queries

https://doi.org/10.1007/978-3-030-32686-9_18

Abedin, Paniz ; Ganguly, Arnab ; Pissis, Solon P. ; Thankachan, Sharma V. ( January 2019 , String Processing and Information Retrieval (SPIRE))

Full Text Available
On Computing Average Common Substring Over Run Length Encoded Sequences

https://doi.org/10.3233/FI-2018-1743

Hooshmand, Sahar ; Tavakoli, Neda ; Abedin, Paniz ; Thankachan, Sharma V. ; Charalampopoulos, Panagiotis ; Crochemore, Maxime ; Pissis, Solon P. ( November 2018 , Fundamenta Informaticae)

Full Text Available